feat: clean the config and fix the manifests generation#1953
Conversation
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 53 out of 56 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (10)
config/rbac/kustomization.yaml:1
- The comment states leader-election RBAC is 'intentionally disabled' and that 'the production deployment does not enable --leader-elect', but the rendered Deployment in
manifests/setup/setup.yaml(andconfig/manager/manager.yaml) passes--leader-elect, and these two leader-election resources are actively included (not commented out). Either remove the contradictory comment or rewrite it to reflect that leader-election RBAC is enabled to match the manager args.
tests/scripts/fluentd_e2e.sh:1 - Using a literal
\\nin the sed replacement to inject a newimagePullPolicy:line is a GNU sed extension; BSD/macOS sed will emit a literalninstead of a newline and produce invalid YAML, breaking the script for developers running e2e locally on macOS. Consider either using two separate sed expressions where the second usesa\\to append a line, piping throughawk, or constructing the multi-line replacement using a literal newline inside the shell string ($'\\n'). At minimum, document the GNU-sed dependency.
manifests/setup/setup.yaml:1 - The manager is started with
--metrics-bind-address=:8443but the container declares nocontainerPortfor 8443 and the metrics Service / NetworkPolicy / ServiceMonitor sections inconfig/default/kustomization.yamlare all commented out. The metrics endpoint will therefore be unreachable from outside the pod. If metrics aren't intended to be exposed in the default bundle, consider setting--metrics-bind-address=0(disabled) to avoid binding an unused port; if they are intended, expose the port and enable the metrics Service.
manifests/setup/setup.yaml:1 readOnlyRootFilesystem: trueis set, but the manager binary currently writes leader-election cache files (controller-runtime writes to its temp dir) and may rely on/tmp. Without anemptyDirmount at/tmp(only theenvconfigmap is mounted under/fluent-operator), the pod can fail to acquire the leader lease or to perform any operation that touches the filesystem. Consider adding anemptyDirvolume mounted at/tmpto keep the read-only root while preserving writable scratch space.
config/rbac/role_binding.yaml:1- The
ServiceAccountsubject no longer specifies anamespace. For aClusterRoleBinding, the subject'snamespaceis required whenkind: ServiceAccount; kustomize's namespace transformer does set it for ServiceAccount resources, but it does not by default rewrite subject namespaces inside (Cluster)RoleBindings unless configured to. Verify the renderedsetup.yamlactually containsnamespace: fluentunder this subject (looking at the diff forsetup.yaml, the ClusterRoleBinding still referencesnamespace: fluent, so kustomize is handling it — please confirm this stays true if a consumer overrides the namespace inmanifests/setup/kustomization.yaml).
config/manager/manager.yaml:1 - Quoting
ALLis unnecessary and inconsistent with the renderedmanifests/setup/setup.yaml(which emits- ALLunquoted). Drop the quotes to keep source and generated forms identical and avoid spurious diffs on regeneration.
controllers/fluentdconfig_controller.go:1 - The previous markers granted
get;update;patchonfluentdconfigs/statusandupdateonfluentdconfigs/finalizers. The new consolidated marker on line 90 covers*/statusbut the/finalizerspermission has been dropped entirely. If the controller still adds/removes finalizers onfluentdconfigsorclusterfluentdconfigs(the previous code did via theupdateverb on/finalizers), reconciliation will fail with a forbidden error. Please verify whether finalizer handling was removed; if not, restore a+kubebuilder:rbac:groups=fluentd.fluent.io,resources=clusterfluentdconfigs/finalizers;fluentdconfigs/finalizers,verbs=updatemarker.
controllers/fluentbit_controller.go:1 - The marker
core,resources=events,verbs=list;watchwas removed here. If the controller's manager still records Events (controller-runtime's default event recorder callscreate/patchon events) or any reconciler watches them, this will produce permission errors. The rendered ClusterRole insetup.yamlno longer grants events at all (only the new leader-electionRolegrantsevents: create;patchin thefluentnamespace, which won't cover cluster-scoped consumers). Please confirm event recording still works at runtime, and if so, retain at leastevents,verbs=create;patchon the operator's ClusterRole or namespaced Role as appropriate.
tests/scripts/fluentd_e2e.sh:1 - The
VERSIONvariable (previously read from theVERSIONfile) was removed, but if other parts of this script or downstream tooling referenced$VERSION, those references will now be unbound. Please confirm nothing else in the file (not shown in the diff) still uses$VERSION.
manifests/setup/setup.yaml:1 - The leader-election
Role/RoleBindingare hard-coded tonamespace: fluent. If a user customizes the install namespace viamanifests/setup/kustomization.yaml(the comment in that file suggests doing so), this resource will be left influentwhile the Deployment moves to the new namespace, and leader election will fail with a forbidden error on leases. Confirm the kustomize namespace transformer rewrites these as expected, and document the requirement to regenerate (make manifests) after changing namespaces.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 54 out of 57 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
config/rbac/role_binding.yaml:15
subjectsfor aClusterRoleBindingthat targets aServiceAccountshould includenamespace. With it omitted, applying this YAML directly would bind to a ServiceAccount in an empty/unknown namespace (effectively granting no permissions). If you intend to rely on kustomize’s namespace transformer to inject it, consider making that explicit (or setnamespace: fluenthere) to avoid standalone usage pitfalls.
| name: health | ||
| protocol: TCP | ||
| securityContext: | ||
| readOnlyRootFilesystem: true |
There was a problem hiding this comment.
Do we need an emptyDir at /tmp for controller-runtime's leader election? Something like:
volumeMounts:
- name: tmp
mountPath: /tmp
volumes:
- name: tmp
emptyDir: {}
There was a problem hiding this comment.
Controller-runtime's leader election uses Kubernetes Lease objects (coordination.k8s.io/v1) via the API server — it doesn't write to local disk. No /tmp volume is required for it.
Signed-off-by: Chengwei Guo <cwguoz@gmail.com> Signed-off-by: Chengwei Guo <chengweiguo@bytedance.com>
Signed-off-by: Chengwei Guo <cwguoz@gmail.com> Signed-off-by: Chengwei Guo <chengweiguo@bytedance.com>
Signed-off-by: Chengwei Guo <chengweiguo@bytedance.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 53 out of 56 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (2)
tests/scripts/fluentd_e2e.sh:83
start_fluent_operatorpipesmanifests/setup/setup.yamlintokubectl create, butsetup.yamlnow contains aNamespace fluentmanifest. Since the script also creates the namespace inprepare_cluster, this will fail with an AlreadyExists error (and abort due toerrexit). Consider either removing the explicit namespace creation inprepare_cluster, switching this tokubectl apply, or filtering the Namespace document out ofsetup.yamlfor the e2e path.
config/rbac/role_binding.yaml:15- This
ClusterRoleBinding’s ServiceAccount subject is missingnamespace. Without it, the manifest is invalid when applied directly (the API requiressubjects[].namespacefor ServiceAccount subjects). Add the namespace here (and let kustomize override it if needed) to keep the RBAC manifests self-contained and valid.
Signed-off-by: Chengwei Guo <chengweiguo@bytedance.com>
Signed-off-by: Chengwei Guo <chengweiguo@bytedance.com>
Signed-off-by: Chengwei Guo <chengweiguo@bytedance.com>
Signed-off-by: Chengwei Guo <chengweiguo@bytedance.com>
…ace before install Signed-off-by: Chengwei Guo <chengweiguo@bytedance.com>
Signed-off-by: Chengwei Guo <chengweiguo@bytedance.com>
Signed-off-by: Chengwei Guo <chengweiguo@bytedance.com>
Wire up the HTTPS :8443 metrics endpoint that the operator already serves by default: enable metrics-auth RBAC so TokenReview/SAR succeed, expose the metrics Service, and align its selector with the operator pod labels. Signed-off-by: Chengwei Guo <chengweiguo@bytedance.com>
- leader_election_role: drop unused configmaps verbs (controller-runtime uses Leases by default). - network-policy/prometheus monitor: fix podSelector / matchLabels to match the operator pod labels so endpoints actually resolve. - manager_metrics_patch: append the metrics flag instead of inserting at args/0 to be robust against arg reordering. - RBAC: add consistent app.kubernetes.io labels to leader-election and metrics_* roles/bindings; document that hardcoded subject namespaces are rewritten by kustomize overlays. Signed-off-by: Chengwei Guo <chengweiguo@bytedance.com>
|
@joshuabaird please help review again. Thanks. |
What this PR does / why we need it:
Cleans up
config/and consolidate themanifests/setup/setup.yaml,config/default/now is the single canonical kustomize entry point for the production install bundle, and also fixes the kubebuilder markers to generate the wanted roles and so on.Which issue(s) this PR fixes:
Fixes #
Does this PR introduced a user-facing change?
Additional documentation, usage docs, etc.: